IN THE SPECIFICATION:

Please substitute the following specification for the specification submitted with the 
application. A clean copy of the substitute specification is provided immediately following the 
signature page of this amendment. 

DECISION MAKING IN CLASSIFICATION PROBLEMS 

Field of the Invention 

The invention relates to decision making in classification problems and relates 
particularly, though not exclusively, to improved methods of classification in decision fusion 
applications. 

Background of the Invention 

Decision fusion is a widely used technique for several kinds of classification applications 
such as, for example, medical imaging, biometric verification, signature or fingerprint 
recognition, robot vision, speech recognition, image retrieval, expert systems etc. 

Generally, in decision fusion applications, multiple classifiers (or experts) perform 
separate classification experiments on respective data sets, and consequently designate a 
nominated class as correct. The classifier decisions are then combined in a predetermined 
selection strategy to arrive at the final class, as described below. Two extreme approaches for 
the combination strategy are outlined below: 

1. The first approach may accept the decision of the majority of the classifiers as the final
decision (decision consensus approach).

2. The second approach can take the decision of the most competent expert as the final decision
(most competent expert approach).
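By way of illustration only, the two extreme strategies above might be sketched as follows. The function names and the `competence` scores (for example, past accuracy of each classifier) are assumptions introduced for this sketch and do not appear in the specification.

```python
from collections import Counter


def majority_vote(decisions):
    """Decision consensus: return the class nominated by the most classifiers."""
    # most_common(1) returns [(label, count)]; on a tie, the label seen first wins.
    return Counter(decisions).most_common(1)[0][0]


def most_competent_expert(decisions, competence):
    """Return the decision of the classifier with the highest competence score."""
    best = max(range(len(decisions)), key=lambda i: competence[i])
    return decisions[best]


decisions = ["a", "b", "b"]        # one nominated class per classifier
competence = [0.9, 0.6, 0.7]       # hypothetical track record per classifier
consensus = majority_vote(decisions)
expert = most_competent_expert(decisions, competence)
```

In this example the two strategies disagree: the consensus is class "b", whereas the most competent expert nominates class "a".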

An intermediate approach involves determining a solution in which a consensus decision 
is evaluated in terms of the past track records of the experts. Instead of directly accepting the 
consensus decision, the reliability of each decision is evaluated through various kinds of 
confidence measures. The decision is either accepted or rejected based on the result of such an 
evaluation. 

In a further approach, a Bayesian cost function is minimized over all the
decisions given by the experts. The cost function is defined as the cost of making a wrong 
decision multiplied by the joint probability of occurrence of the respective decisions. 

None of the approaches outlined above is rigorously optimal or
universally applicable, and each can be subject to errors or limitations of one kind or another.
Accordingly, it is an object of the invention to at least attempt to address these and other 
limitations associated with the prior art. In particular, it is an object of the invention to generally 
improve the classification accuracy of particular decision fusion applications which rely on one 
of the prior art approaches outlined above. 

Summary of the Invention 

The inventive concept is founded in recognition that the reliability of a classifier in a 
decision fusion architecture can vary from sample to sample and from experiment to experiment. 
The inventive concept involves using the decisions from multiple classifiers in a decision fusion 
application to make an informed decision as to the classifier which is likely to be correct.

More particularly, the inventive concept resides in recognition that a strategy of
assigning confidences to different classifiers in a decision fusion architecture can be used to
improve the classification accuracy of a decision fusion application. This inventive strategy
results in improved classification accuracy as compared to the case where static
confidence measures (or weights) for classifiers are used across samples during the experiment
or even across experiments.

Embodiments of the invention involve adapting the weight
given to a particular classifier from sample to sample, which generally results in improved
performance compared with prior art approaches. A weight or metric of relative confidence is
computed for every classifier by determining its sample confidence
and overall confidence (as subsequently described). For each class, an overall score (or
likelihood) is calculated which combines individual scores from all classifiers, which allows the
class with the highest score (or likelihood) to be designated as the correct class.

The invention provides a method suitable for deciding how to classify a sample in one of 
a number of predetermined classes, the method comprising: 

(a) calculating a weight $w_{ij}$ for a classifier $i$ ($1 \le i \le \|C\|$), where $\|C\|$ is the
cardinality of the set $C$ of classifiers, for a sample $j$ to be classified in one of a number of
predetermined classes $K$;

(b) calculating, for each class $k$ ($1 \le k \le \|K\|$), a weighted summation $CL_{jk}$ (given
below) across said classifiers $i$ of the likelihood $l_{ijk}$ that the sample $j$ belongs to class $k$
as given by classifier $i$, weighted by the weight $w_{ij}$. Here $\|K\|$ is the cardinality of the set
of classes $K$:

$$CL_{jk} = \sum_{i=1}^{\|C\|} w_{ij} \, l_{ijk}$$

and

(c) designating the sample $j$ as belonging to the class $k$ for which $CL_{jk}$ is greatest in value.
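Steps (b) and (c) can be illustrated with a minimal sketch for a single sample. It assumes the per-classifier weights have already been calculated in step (a); the function name `fuse` and the numerical values are hypothetical.

```python
def fuse(log_likelihoods, weights):
    """Weighted fusion for one sample j.

    log_likelihoods[i][k] is the likelihood l_ijk given by classifier i for class k.
    weights[i] is the weight w_ij of classifier i for this sample.
    Returns (index of the designated class, list of combined scores CL_jk).
    """
    num_classes = len(log_likelihoods[0])
    # Step (b): weighted summation across classifiers for each class k.
    combined = [
        sum(w * scores[k] for w, scores in zip(weights, log_likelihoods))
        for k in range(num_classes)
    ]
    # Step (c): designate the class whose combined score is greatest.
    best = max(range(num_classes), key=combined.__getitem__)
    return best, combined


# Two classifiers, three classes; values are illustrative only.
best_class, scores = fuse(
    [[-1.0, -2.0, -3.0], [-2.5, -0.5, -3.0]],
    [0.3, 0.7],
)
```

Here the second classifier carries the larger weight, so its preferred class (index 1) is designated even though the first classifier prefers class 0.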

The invention further provides an apparatus for calculating the weighted summation $CL_{jk}$,
for designating the class of the sample $j$, and for determining the weight $w_{ij}$ for classifier $i$ and
sample $j$. The weight can be derived from a metric of relative confidence in the decision of
the respective classifier $i$. Preferably, this is an L-statistic (a linear combination of the order
statistics), which represents the statistical separation among the order statistics, preferably
log-likelihoods, against the class models for the classifier.

This determination of relative confidence is performed by two different
methods to calculate two components of the weight,
referred to as sample confidence $L_{ij}$ and overall confidence $H_i$. These
confidence values of the classifier $i$ are subsequently used to combine the decisions
from all the classifiers $i$ to obtain the final decision.

The L-statistic for a particular sample $j$, $L_{ij}$, can be defined as:

$$L_{ij} = a_1 \, l'_{ij1} + a_2 \, l'_{ij2} + \dots + a_{\|K\|} \, l'_{ij\|K\|}$$

where $l'_{ijk}$ denotes, for sample $j$ and classifier $i$, the log-likelihood of the $k$th most likely class,
such that the $l'_{ijk}$ form an order statistic, that is, $l'_{ij1} > l'_{ij2} > \dots > l'_{ijk} > \dots > l'_{ij\|K\|}$.
The values of $a_k$ ($1 \le k \le \|K\|$) define the form of the particular L-statistic $L_{ij}$ chosen.
Preferably, the L-statistic used is simply the difference between the log-likelihoods of the two
most likely classes. That is, $a_1 = 1$, $a_2 = -1$ and all other values of $a_k = 0$.
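As a hedged illustration, the L-statistic may be computed by sorting the log-likelihoods in descending order and forming the linear combination with the coefficients. The function name `l_statistic` is an assumption; the default coefficients are the preferred choice of 1 and -1, with all omitted coefficients treated as zero.

```python
def l_statistic(log_likelihoods, coeffs=(1.0, -1.0)):
    """Sample confidence of one classifier for one sample.

    log_likelihoods: log-likelihoods of the sample against every class model.
    coeffs: coefficients a_1, a_2, ...; any coefficient not supplied is zero.
    """
    # Order statistic: most likely class first.
    ordered = sorted(log_likelihoods, reverse=True)
    # zip stops at the shorter sequence, so missing coefficients contribute 0.
    return sum(a * value for a, value in zip(coeffs, ordered))


# Ordered values are -1.0, -2.5, -4.0, so the result is (-1.0) - (-2.5).
confidence = l_statistic([-4.0, -1.0, -2.5])
```

A larger value indicates that the most likely class is well separated from the runner-up for this sample.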

A cumulative mean $H_i$ of the sample confidences $L_{ij}$ over a large number of
samples is used to measure the overall discrimination capability of the classifier and forms the
second component of the weight $w_{ij}$.

$$H_i = \frac{1}{t} \sum_{j=1}^{t} L_{ij}$$

It is currently understood that the value of the overall confidence $H_i$ so calculated converges to a
constant value which is well separated for different overall confidence levels.

In the equation directly above, $t$ is the number of samples after which the overall
confidence value converges to a constant. $H_i$ attempts to model some kind of
disturbance or noise which is application specific. Typically, such noise degrades the efficiency 
of the classifier across all classes. For example, in the case of speech recognition, this may be 
ambient noise (such as car noise, cocktail party noise) present in the audio channel. There may 
be, of course, some cases in which the amount of noise present in the classifier varies during the 
experiment. 
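The cumulative mean can be maintained incrementally as samples arrive, without storing past values. The following is a sketch only; the class name `OverallConfidence` is hypothetical.

```python
class OverallConfidence:
    """Running (cumulative) mean of the sample confidences of one classifier."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, sample_confidence):
        """Fold one new sample confidence into the mean and return the new mean."""
        self.count += 1
        # Incremental form of (1/t) * sum of the first t sample confidences.
        self.mean += (sample_confidence - self.mean) / self.count
        return self.mean


tracker = OverallConfidence()
history = [tracker.update(value) for value in (1.0, 2.0, 3.0)]
```

Where the noise level varies during an experiment, a moving average over a recent window may be a more suitable choice than a cumulative mean; that variation is not shown here.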

For every incoming sample $j$, sample confidence values $L_{ij}$ ($1 \le i \le \|C\|$) are computed
for every classifier $i$. The overall confidence $H_i$ for each of the classifiers in $C$ is updated using
$L_{ij}$. Preferably, a weight $w_{ij}$ is assigned to each classifier $i$ as a function of the overall confidence
$H_i$ and the sample confidence $L_{ij}$. Once weights $w_{ij}$ for each classifier are known, each incoming
sample $j$ can be classified in a class $k$ by calculating the combined log-likelihood $CL_{jk}$ for each
class $k$, as set out directly below.

$$CL_{jk} = \sum_{i=1}^{\|C\|} w_{ij} \, l_{ijk}$$

where $w_{ij} = f(L_{ij}, H_i)$.

For the sample $j$, the class $k$ with the highest calculated combined log-likelihood $CL_{jk}$ is
finally chosen as the correct class $k$ for sample $j$. Here $l_{ijk}$ denotes the log-likelihood of sample $j$
for class $k$ using classifier $i$.

The invention also includes a computer program product for performing embodiments of 
the inventive methods described above. 

Embodiments of the invention can be used in various applications in which decision 
fusion is conventionally used. 

Brief Description of the Drawings 

Fig. 1 is a schematic representation of the process involved in reaching a decision in a 
classification problem, in accordance with an embodiment of the invention. 

Fig. 2 is a schematic representation of the process involved in determining a weight using 
a threshold value, in accordance with an embodiment of the invention. 

Fig. 3 is a schematic representation of computing hardware suitable for performing 
embodiments of the invention. 

Detailed Description of Embodiments and Best Mode 

An embodiment of the invention is described below in the context of an audiovisual
speech recognition application which uses fusion for classification problems. In this context,
there are two relevant classifiers: audio and video.

In overview, the classification or recognition process initially involves steps as outlined
in Fig. 1. Initially, in step 10, the process involves calculating a metric of relative
confidence for respective classifiers or class models which predict how a sample should be
recognized. The sample confidence $L_{ij}$ is calculated in step 20 as an L-statistic of the log-likelihoods $l_{ijk}$,
as detailed below. The moving average $H_i$ across a suitable number of samples $j$ is then
determined in step 30. This allows weights $w_{ij}$ to be calculated in step 40 for each classifier
using $H_i$ and $L_{ij}$, according to a suitable function as detailed below. The combined
likelihoods across classifiers $CL_{jk}$ are then calculated in step 50 as a weighted summation of the
likelihoods of each class, so that the most likely class can then be determined in step 60.
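The ordering of steps 20 to 60 can be sketched as a loop over incoming samples. This is an illustrative sketch under stated assumptions: the generator name `classify_stream` is hypothetical, the L-statistic is taken as the difference between the two largest log-likelihoods, a cumulative mean stands in for the moving average, and the weight function is supplied by the caller as a placeholder.

```python
def classify_stream(samples, weight_fn):
    """Yield the designated class index for each incoming sample.

    samples: iterable in which each item holds, per classifier, a list of
             log-likelihoods over the classes.
    weight_fn: caller-supplied function (sample confidence, overall confidence)
               returning a weight; a placeholder for the function f.
    """
    overall, count = None, 0
    for per_classifier in samples:
        # Step 20: sample confidence as the gap between the two best classes.
        sample_conf = []
        for scores in per_classifier:
            ordered = sorted(scores, reverse=True)
            sample_conf.append(ordered[0] - ordered[1])
        # Step 30: update the running mean of the sample confidences.
        count += 1
        if overall is None:
            overall = [0.0] * len(sample_conf)
        overall = [h + (c - h) / count for h, c in zip(overall, sample_conf)]
        # Step 40: one weight per classifier for this sample.
        weights = [weight_fn(c, h) for c, h in zip(sample_conf, overall)]
        # Step 50: weighted summation of likelihoods for each class.
        num_classes = len(per_classifier[0])
        combined = [
            sum(w * scores[k] for w, scores in zip(weights, per_classifier))
            for k in range(num_classes)
        ]
        # Step 60: the class with the greatest combined likelihood.
        yield max(range(num_classes), key=combined.__getitem__)


# One sample, two classifiers, two classes; the weight function is hypothetical.
stream = [[[-1.0, -3.0], [-2.0, -1.9]]]
decisions = list(classify_stream(stream, lambda sample_c, overall_c: sample_c))
```

In this example the first classifier separates the two classes clearly and so dominates the combination, while the second classifier, which is nearly undecided, contributes little.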

For the speech recognition application decision, the problem can be defined as follows. 
Given an audio and a video vector corresponding to a particular speech time frame it is 
necessary to determine the phone class to which this frame belongs. Phones are
modeled as GMMs (Gaussian Mixture Models) obtained from the training data.
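For illustration, the log-likelihood of a feature vector under one phone model might be evaluated as below. The specification does not state the covariance structure of the mixture components; diagonal covariances are assumed here, and the function name `gmm_log_likelihood` and its parameters are hypothetical.

```python
import math


def gmm_log_likelihood(x, mix_weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian mixture.

    mix_weights[m]: mixture weight of component m (weights sum to 1).
    means[m], variances[m]: per-dimension mean and variance of component m.
    """
    component_logs = []
    for weight, mean, var in zip(mix_weights, means, variances):
        # Log density of an axis-aligned Gaussian, summed over dimensions.
        log_gauss = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mean, var)
        )
        component_logs.append(math.log(weight) + log_gauss)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


# Single standard normal component evaluated at its mean.
value = gmm_log_likelihood([0.0], [1.0], [[0.0]], [[1.0]])
```

Evaluating a frame against every phone model in this way yields the set of log-likelihoods from which the L-statistic is formed.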

Given an audio vector for speech frame $j$, its likelihood of corresponding to each of the
phone classes is computed from the respective classification models. From these likelihoods, the
L-statistic is preferably chosen simply as the difference between the first and the second most
likely choices. As a result, coefficients $a_k$ are used as follows.

$a_1 = 1$, $a_2 = -1$, all other values of $a_k = 0$

A similar computation is also performed for the video vector. The L-statistic is shown as
$L_{ij}$ in Fig. 2. The cumulative mean of the L-statistic, $H_i$, is used here to model the background noise
present in the audio channel only, as background noise uniformly degrades the audio recognition
rate across all phonetic sounds. Accordingly, the L-statistic $L_{ij}$ decreases uniformly in the
presence of noise. The combined likelihood corresponding to a particular phone class for
the speech frame $j$ is computed as follows.

$$CL_{jk} = w_{aj} \, l_{ajk} + w_{vj} \, l_{vjk}$$

Here $l_{ajk}$ and $l_{vjk}$ are the log-likelihoods for phone
class $k$ given by audio and video respectively, and $w_{aj}$ and $w_{vj}$ are the weights
assigned to the audio and video classifiers respectively for speech frame
$j$. The phone class $k$ with the highest combined likelihood $CL_{jk}$ is selected as the correct phone
class.

The weight for audio is determined and, since there are only two classifiers in this case,
the weight for video is simply determined as the complement of the weight for audio, as the
linear summation of all weights is 1. A threshold ai is defined for sample confidence values of
audio, which are just the L-statistic in this case. First, the sample confidence value for audio is
checked against its threshold in step 100. If it passes this test, the audio weight is computed in step
110 as the sum of a constant term and a term which is dependent on the overall confidence of the audio
channel. If audio fails this test, the constant term in the weight is changed in step 120.

Hence, in this embodiment, function $f(\,)$ is implemented as

$$w_{ij} = f(L_{ij}, H_i) = f_1(L_{ij}) + f_2(H_i)$$

where $f_1(\,)$ is chosen as a threshold function and $f_2(\,)$ is given as

$$f_2(H_i) = \frac{x_1}{1 + \exp(x_2 H_i)}$$
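A minimal sketch of this weight function for the audio channel follows. The two constant terms (`const_pass`, `const_fail`), the use of "greater than or equal" in the threshold test, and all numerical values are assumptions made for illustration; the specification states only that the constant term differs depending on whether the threshold test is passed.

```python
import math


def audio_weight(sample_conf, overall_conf, threshold,
                 const_pass, const_fail, x1, x2):
    """Weight for the audio classifier as the sum of two components."""
    # First component: threshold function of the sample confidence.
    f1 = const_pass if sample_conf >= threshold else const_fail
    # Second component: x1 / (1 + exp(x2 * overall confidence)).
    f2 = x1 / (1.0 + math.exp(x2 * overall_conf))
    return f1 + f2


def video_weight(w_audio):
    """With two classifiers the weights sum to 1, so video takes the complement."""
    return 1.0 - w_audio


# Hypothetical parameter values chosen only to make the arithmetic easy to follow.
w_pass = audio_weight(2.0, 0.0, threshold=1.0,
                      const_pass=0.5, const_fail=0.2, x1=0.4, x2=1.0)
w_fail = audio_weight(0.5, 0.0, threshold=1.0,
                      const_pass=0.5, const_fail=0.2, x1=0.4, x2=1.0)
w_video = video_weight(w_pass)
```

With an overall confidence of zero the second component equals half of the first scalar parameter, so the audio weight is 0.7 when the threshold test is passed and 0.4 when it is failed. In practice the constants and scalar parameters would need to be chosen so that the resulting weight remains between 0 and 1.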

Parameters $x_1$ and $x_2$ are scalar values that are selected and, if necessary, adjusted to
provide good performance. Preferably, sample confidence is used as a confidence measure for a
classifier for the current sample being processed. The sample confidence models non-uniform
discrimination capability of the classifier across various classes due to the non-uniform
dispersion of the clusters in vector space for the data set of the classifier. The sample confidence
does not represent the overall discrimination capability of the classifier. A low value of the 
sample confidence indicates low confidence in its decision for the present sample. Similarly, a 
high value of the sample confidence indicates a higher confidence in its decision for that sample. 
The sample confidence for the present sample is preferably represented by the L-statistic for the 
sample. 

Preferably, overall confidence represents the overall discrimination capability of the 
classifier across all classes (or clusters). This overall discrimination capability may vary
between experiments due to the presence of noise which uniformly degrades the
classifier's discrimination capability across all classes. For example, in the case of speech
recognition, this may be background noise present in the audio channel. 

In this application, it is possible to achieve improvements in phonetic classification 
results using the techniques of the described embodiment of the invention. Computer hardware 
for performing embodiments of the invention is now described. 

The described process of classification can be implemented using a computer program 
product in conjunction with a computer system 200 as shown in Fig. 3. In particular, the process 
can be implemented as software, or computer readable program code, executing on the computer 
system 200. 

The computer system 200 includes a computer 250, a video display 210, and input
devices 230, 232. In addition, the computer system 200 can have any of a number of other
output devices including line printers, laser printers, plotters, and other reproduction devices
connected to the computer 250. The computer system 200 can be connected to one or more other
computers via a communication input/output (I/O) interface 264 using an appropriate
communication channel 240 such as a modem communications path, an electronic network, or
the like. The network may include a local area network (LAN), a wide area network (WAN), an
Intranet, and/or the Internet 220.

The computer 250 includes the control module 266, a memory 270 that may include 
random access memory (RAM) and read-only memory (ROM), input/output (I/O) interfaces 264, 
272, a video interface 260, and one or more storage devices generally represented by the storage 
device 262. The control module 266 is implemented using a central processing unit (CPU) that 
executes or runs a computer readable program code that performs a particular function or related 
set of functions. 

The video interface 260 is connected to the video display 210 and provides video signals 
from the computer 250 for display on the video display 210. User input to operate the computer 
250 can be provided by one or more of the input devices 230, 232 via the I/O interface 272. For 
example, a user of the computer 250 can use a keyboard as input device 230 and/or a pointing
device such as a mouse as input device 232. The keyboard and the mouse provide input to the
computer 250. The storage device 262 can consist of one or more of the following: a floppy 
disk, a hard disk drive, a magneto-optical disk drive, CD-ROM, magnetic tape or any other of a 
number of non-volatile storage devices well known to those skilled in the art. Each of the 
elements in the computer 250 is typically connected to other devices via a bus 280 that in
turn can consist of data, address, and control buses. 

The method steps are effected by instructions in the software that are carried
out by the computer system 200. Again, the software may be implemented as one or more
modules for implementing the method steps.

In particular, the software may be stored in a computer readable medium, including
the storage device 262, or may be downloaded from a remote location via the interface 264 and
communications channel 240 from the Internet 220 or another network location or site. The
computer system 200 includes the computer readable medium having such software or program
code recorded such that instructions of the software or the program code can be carried out. The
use of the computer system 200 preferably effects advantageous apparatuses for
classifying samples in accordance with the
embodiments of the invention.

The computer system 200 is provided for illustrative purposes and other configurations 
can be employed without departing from the scope and spirit of the invention. The foregoing is 
merely an example of the types of computers or computer systems with which the embodiments 
of the invention may be practiced. Typically, the processes of the embodiments are
resident as software or a computer readable program code recorded on a hard disk drive as the 
computer readable medium, and read and controlled using the control module 266. Intermediate 
storage of the program code and any data including entities, tickets, and the like may be 
accomplished using the memory 270, possibly in concert with the storage device 262. 

In some instances, the program may be supplied to the user encoded on a CD-ROM or a 
floppy disk (both generally depicted by the storage device 262), or alternatively could be read by 
the user from the network via a modem device connected to the computer 250. Still further, the 
computer system 200 can load the software from other computer readable media. This may 
include magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red 
transmission channel between the computer and another device, a computer readable card such 
as a PCMCIA card, and the Internet 220 and Intranets including email transmissions and
information recorded on Internet sites and the like. The foregoing are merely examples of
relevant computer readable media. Other computer readable media may be used
without departing from the scope and spirit of the invention.

Further to the above, the described methods can be realized in a centralized
fashion in one computer system 200, or in a distributed fashion where different
elements are spread across several interconnected computer systems.

Computer program means or computer program in the present context mean any 
expression, in any language, code or notation, of a set of instructions intended to cause a system 
having an information processing capability to perform a particular function either directly or 
after either or both of the following: a) conversion to another language, code or notation or b) 
reproduction in a different material form. 

In the foregoing manner, a method, an apparatus, and a computer program product are
disclosed. While only a small number of embodiments are described, it will be apparent to those 
skilled in the art in view of this disclosure that numerous changes and/or modifications can be 
made without departing from the scope and spirit of the invention. 

It is understood that the invention is not limited to the embodiment described, but that
various alterations and modifications, as would be apparent to one skilled in the art, are included
within the scope of the invention.